Abstract

A certified static analysis is an analysis whose semantic validity has been formally proved cor- 
rect with a proof assistant. The recent increasing interest in using proof assistants for mechanizing 
programming language metatheory has given rise to several approaches for certification of static 
analysis. We propose a panorama of these techniques and compare their respective strengths and 
weaknesses. 


1 Introduction 

Nowadays, safety-critical systems are validated through long and costly test campaigns. Static analysis
is a promising complementary technique that makes it possible to automatically prove the absence of
restricted classes of bugs. A significant example is the state-of-the-art ASTREE static analyzer for C [11],
which has proven some critical safety properties for the primary flight control software of the Airbus A340
fly-by-wire system. Given such a success, the next question is: should we completely remove
the test campaign dedicated to the same class of bugs? If we trust the result of the analyzer, the answer
is of course yes, but should we trust it? The analyzer itself can be certified by testing, but exhaustiveness
cannot be achieved. In this paper, we show how mechanized proofs can be used to certify static analyzers
or their results.

Abstract interpretation [10] is a general theory that aims at designing provably correct static analyzers,
but pencil-and-paper proofs hardly scale to real-size analyzers for real-size programming languages.
Proof assistants make it possible to mechanically specify, program and prove correct static analyzers with
respect to a formal model of the programming language semantics. Although the feasibility of such a
technique has been demonstrated for various kinds of analyses and programming languages [13, 2, 7, 18, 9, 5, 20],
many approaches coexist and some of them differ in the kind of guarantee they give on the targeted
static analysis. In this work, we compare different techniques, taking into account
the proof effort, the obtained guarantee and maintenance problems. The paper is organized as follows:
we first show how an analysis can be specified depending on the expected guarantees. We then address
the question of computing a certified solution of the analysis. Finally we investigate the use of deductive
verification to validate the invariant generated by an analysis.

For presentation purposes, all the analyses we describe here share a common semantic basis,
which we give below, together with a summary of abstract interpretation principles. A program $P$ is
a graph $(N, E)$ where $N \geq 1$ is the number of vertices (control points), and $E$ is a set of edges
$(n,m) \in \{1..N\} \times \{1..N\}$. Each edge is labeled by an instruction $i_{n,m}$ from a set $I$. Among
nodes, we distinguish an entry point $n_e$ such that there is no incoming edge in $n_e$. A state of a
program $P$ is composed of a control point $n$ and an environment $\rho$: $\mathrm{State} = \{1..N\} \times \mathrm{Env}$.
The concrete semantics of an instruction $i \in I$ is given by a binary relation $\to_i$ over $\mathrm{Env}$.
The transfer function $F_i : \wp(\mathrm{Env}) \to \wp(\mathrm{Env})$ associated to instruction $i$ is then
defined by $F_i(S) = \{\rho' \mid \exists \rho \in S : \rho \to_i \rho'\}$. Given an initial set of environments
$S_0$, the collecting semantics $[\![P]\!]^c$ of $P$ is the least solution $S \in \wp(\mathrm{Env})^N$ of
$S_{n_e} = S_0$ and $\forall m \in \{1..N\},\ S_m = \bigcup_{(n,m) \in E} F_{i_{n,m}}(S_n)$.
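To make the collecting semantics concrete, here is a minimal Python sketch under strong simplifying assumptions: environments are single integers, the four-node program and the names `EDGES`, `transfer` and `collecting_semantics` are hypothetical, and the iteration terminates only because this particular program reaches finitely many environments.

```python
# Hypothetical toy program: environments are plain integers (the value of x).
# Each edge (n, m) carries an instruction, modelled as a function that maps one
# environment to the set of environments it can reach (the relation ->_i).
EDGES = {
    (1, 2): lambda x: {0},                         # x := 0
    (2, 3): lambda x: {x} if x < 3 else set(),     # assume x < 3
    (3, 2): lambda x: {x + 1},                     # x := x + 1
    (2, 4): lambda x: {x} if x >= 3 else set(),    # assume x >= 3
}
N, ENTRY = 4, 1


def transfer(instr, envs):
    """F_i(S) = { rho' | there is rho in S with rho ->_i rho' }."""
    return set().union(*(instr(rho) for rho in envs))


def collecting_semantics(s0):
    """Kleene iteration from empty sets up to the least solution."""
    sem = {n: set() for n in range(1, N + 1)}
    sem[ENTRY] = set(s0)
    changed = True
    while changed:
        changed = False
        for (n, m), instr in EDGES.items():
            new = sem[m] | transfer(instr, sem[n])
            if new != sem[m]:
                sem[m], changed = new, True
    return sem


result = collecting_semantics({7})
# result == {1: {7}, 2: {0, 1, 2, 3}, 3: {0, 1, 2}, 4: {3}}
```

On realistic programs the sets of environments are infinite or far too large to enumerate, which is why an over-approximation is needed.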

The abstract interpretation formalism gives us a way to over-approximate the solution to these equations.
An abstract semantics is expressed w.r.t. an abstract domain$^1$ $\mathrm{Env}^\sharp$ (usually a complete lattice
$(\mathrm{Env}^\sharp, \sqsubseteq^\sharp)$), and the relation between concrete and abstract semantics is given by a
Galois connection $(\alpha, \wp(\mathrm{Env}), \mathrm{Env}^\sharp, \gamma)$, i.e. a pair of monotone mappings such that
$\forall S \subseteq \mathrm{Env} : S \subseteq \gamma(\alpha(S))$ and
$\forall d \in \mathrm{Env}^\sharp : \alpha(\gamma(d)) \sqsubseteq^\sharp d$. The abstraction function $\alpha$ maps a set $S$
of environments to an abstract environment, which can be seen as the least property satisfied by all elements
of $S$. The concretization function $\gamma$ maps an abstract environment to all the concrete environments
satisfying the corresponding property. We are interested in computing an abstract semantics
$[\![P]\!]^\sharp : \{1..N\} \to \mathrm{Env}^\sharp$ which is a certified correct (over-)approximation of the concrete
collecting semantics, i.e. $\forall i \in \{1..N\} : [\![P]\!]^c(i) \subseteq \gamma([\![P]\!]^\sharp(i))$.

$^1$ Here we focus on the specific case where the abstract states are mappings from $\{1..N\}$ to $\mathrm{Env}^\sharp$.

In addition to the systematic treatment of this safety issue, the theory provides an optimal specification
of the abstract semantics: given a mapping $F : \wp(\mathrm{Env}) \to \wp(\mathrm{Env})$, a correct abstraction of $F$
is a mapping $F^\sharp$ verifying $F \circ \gamma \subseteq \gamma \circ F^\sharp$, or equivalently
$\alpha \circ F \circ \gamma \sqsubseteq^\sharp F^\sharp$. The case where $F^\sharp = \alpha \circ F \circ \gamma$ thus
provides the best correct abstraction of $F$. The abstract semantics computed by the analyzer will then
mimic the collecting semantics, in the sense that it also operates through abstract transfer functions $F^\sharp_i$.
The result $[\![P]\!]^\sharp$ of the analysis is thus the least solution of

$$\alpha(S_0) \sqsubseteq^\sharp S^\sharp_{n_e} \quad \text{and} \quad \forall (n,m) \in E,\ F^\sharp_{i_{n,m}}(S^\sharp_n) \sqsubseteq^\sharp S^\sharp_m \qquad (1)$$

and each $F^\sharp_i$ is proved an optimal correct abstraction of $F_i$. In order to save space, all properties like
$\alpha(S_0) \sqsubseteq^\sharp S^\sharp_{n_e}$ dealing with the treatment of initial states will be omitted from the rest of this paper.

2 Analyzer Specification 

Analysis soundness must be established with respect to the concrete semantics using a correctness relation
that relates the concrete and the abstract domains. In this section we consider two formal frameworks
that both enforce the following essential soundness property: any solution $S^\sharp$ of (1) is a correct
approximation of $[\![P]\!]^c$. Various intermediate paths can be taken between these two opposite approaches;
some of them have been investigated in [13, 7, 18, 9, 5].

2.1 Deep Analyzer Specification 


This first approach (so far only explored in [15]) exactly follows the Galois connection formalism
by providing mechanized proofs of all the classical properties. We briefly recall the components of a
static analyzer based on this formalism and make the proof requirements more precise.


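component | mathematical structure | properties to prove
--- | --- | ---
abstract domain | complete lattice $(\{1..N\} \to \mathrm{Env}^\sharp, \sqsubseteq^\sharp, \sqcup^\sharp, \sqcap^\sharp)$ | existence of a lub
correctness relation | Galois connection $(\alpha, \wp(\mathrm{Env}), \mathrm{Env}^\sharp, \gamma)$ | Galois connection definition
abstract semantics | abstract transfer functions | $F^\sharp_i = \alpha \circ F_i \circ \gamma$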


Defining the abstract domain as a complete lattice obliges us to provide a proof of existence of
a least upper bound for any subset of abstract elements (or a least fixpoint for any monotone function).
Moreover, from a proof assistant point of view, the $\sqcup^\sharp$ operator is not constructive, which hampers its
implementation. Taking a Galois connection as a correctness criterion also increases the number of
proofs to be done, since it provides an optimality property in addition to soundness.

Let us take an example to illustrate this point. We consider a numerical abstraction based on the 
domain of intervals. The concrete domain is $(\wp(\mathbb{Z}), \subseteq, \cup, \cap)$, and the abstract domain is the lattice
of intervals $\{[a,b] \mid a \in \mathbb{Z} \cup \{-\infty\},\ b \in \mathbb{Z} \cup \{+\infty\},\ a \leq b\} \cup \{\bot_{Int}\}$. The corresponding abstraction and
concretization functions are defined by the following equations

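$$\begin{array}{l} \gamma(\bot_{Int}) = \emptyset \\ \gamma([a,b]) = \{x \in \mathbb{Z} \mid a \leq x \leq b\} \end{array} \qquad \alpha(P) = \begin{cases} \bot_{Int} & \text{if } P = \emptyset \\ [\inf(P), \sup(P)] & \text{otherwise} \end{cases}$$
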
If we want to compute $\alpha \circ F \circ \gamma$ for a given computable function $F$, the use of $\alpha$ might force us to provide
a proof that a given subset of $\mathbb{Z}$ is not bounded to ensure that $\top = [-\infty, +\infty]$ is a correct result of the
analysis. If no optimality property were involved, $\top$ could be taken as a correct result in any case.
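The interval abstraction can be sketched in Python as follows. This is only an illustration under stated assumptions: `alpha` is restricted to finite sets (so that inf and sup are computable), gamma is represented by a membership test because it may denote an infinite set, and all names (`Interval`, `alpha`, `in_gamma`, `gamma_bounded`, `leq`) are hypothetical.

```python
import math
from typing import Optional, Tuple

Interval = Optional[Tuple[float, float]]    # None encodes the bottom element
TOP: Interval = (-math.inf, math.inf)


def alpha(values) -> Interval:
    """Abstraction of a *finite* set of integers: [inf(P), sup(P)] or bottom."""
    values = set(values)
    return None if not values else (min(values), max(values))


def in_gamma(x: int, itv: Interval) -> bool:
    """Membership test for gamma(itv), since gamma(itv) may be infinite."""
    return itv is not None and itv[0] <= x <= itv[1]


def gamma_bounded(itv: Interval) -> set:
    """Enumerate gamma(itv); only meaningful when both bounds are finite."""
    return set() if itv is None else set(range(int(itv[0]), int(itv[1]) + 1))


def leq(i1: Interval, i2: Interval) -> bool:
    """Abstract ordering: bottom is least, otherwise inclusion of bounds."""
    if i1 is None:
        return True
    if i2 is None:
        return False
    return i2[0] <= i1[0] and i1[1] <= i2[1]


sample = {-2, 3, 5}
extensive = all(in_gamma(x, alpha(sample)) for x in sample)   # S included in gamma(alpha(S))
reductive = leq(alpha(gamma_bounded((1, 4))), (1, 4))          # alpha(gamma(d)) below d
top_is_sound = all(in_gamma(x, TOP) for x in sample)           # top never needs alpha
```

The last line reflects the remark above: returning `TOP` is sound without any reasoning about `alpha`, whereas justifying it as the optimal answer would require showing that the concrete set is unbounded.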

2.2 Shallow Analyzer Specification 

A second approach consists in focusing only on the soundness of the analysis, without considering the
optimality issue. The choices made here are directly inspired by [12], where the authors describe the
design of the ASTREE static analyzer.


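component | mathematical structure | properties to prove
--- | --- | ---
abstract domain | $(\mathrm{Env}^\sharp, \sqsubseteq^\sharp, \sqcup^\sharp, \sqcap^\sharp)$ | no requirements on $\sqsubseteq^\sharp$, $\sqcup^\sharp$ or $\sqcap^\sharp$
correctness relation | $\gamma : \mathrm{Env}^\sharp \to \wp(\mathrm{Env})$ | $\forall c, d \in \mathrm{Env}^\sharp,\ c \sqsubseteq^\sharp d \Rightarrow \gamma(c) \subseteq \gamma(d)$
abstract semantics | abstract transfer functions | soundness: $F_i \circ \gamma \subseteq \gamma \circ F^\sharp_i$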


Although the shallow framework requires far fewer machine proofs than the deep framework, it guarantees only
a minimal set of properties: the analysis is certainly sound, but it may still contain
precision bugs, which are notoriously hard to debug.
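As an illustration of what a precision bug looks like, the following Python sketch compares two abstract transfer functions for the instruction `x := x + 1` on intervals. Both satisfy the soundness requirement, but only one is optimal. The functions `incr_best`, `incr_buggy` and `sound_on` are hypothetical, and the soundness check is a test by enumeration on one bounded interval, not a proof.

```python
import math


def concrete_incr(x):
    """Concrete semantics of the instruction x := x + 1."""
    return x + 1


def incr_best(itv):
    """Best interval abstraction of x := x + 1 on a non-bottom interval."""
    lo, hi = itv
    return (lo + 1, hi + 1)


def incr_buggy(itv):
    """Hypothetical precision bug: the upper bound is needlessly dropped."""
    lo, _hi = itv
    return (lo + 1, math.inf)


def sound_on(f_sharp, itv):
    """Test F o gamma included in gamma o F# on one bounded interval."""
    lo, hi = f_sharp(itv)
    return all(lo <= concrete_incr(x) <= hi for x in range(itv[0], itv[1] + 1))


both_sound = sound_on(incr_best, (0, 10)) and sound_on(incr_buggy, (0, 10))
precise = incr_best((0, 10))      # (1, 11)
imprecise = incr_buggy((0, 10))   # (1, inf)
```

A shallow specification accepts `incr_buggy` because it only asks for soundness; a deep specification, which requires the result to equal the best abstraction, would reject it.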

3 Result Computation 

The requirements made during the previous phase ensure that any solution of (1) is a correct approximation
of $[\![P]\!]^c$, but do not specify how it is computed. One has to choose between various certification
levels: from a complete proof of the whole analysis computation to a result-only certification.

3.1 Termination 

If we aim at certifying the whole analyzer, we have to prove the termination of an algorithm computing
a solution of (1). Termination proofs are known to be difficult and are seldom mechanized, or even precisely
formalized. Even if we consider a complete lattice, the existence of a least fixpoint for monotone
functions does not ensure the convergence of the computation in finite time. Except for the trivial case
of finite height lattices, we have to exhibit specific properties of the domain, or to design operators that
will ensure convergence.

A first approach consists in certifying that the lattice respects the ascending chain condition, i.e. that
any increasing chain stabilizes in finite time. The main drawback of this approach is that it does not apply
to popular abstract domains like intervals or polyhedra. A more common approach consists in designing
widening and narrowing operators in order to accelerate the convergence [10]. In both approaches,
a mechanized proof of termination has to cope with constructivity issues, and the criteria of ascending
chain or termination of widening-based iteration have to be modified accordingly. In any case, a
key issue for termination proofs is to provide modular lattice constructions, thus allowing a
global proof to be built out of basic blocks (usual numerical abstract domains for instance) [13, 17].
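The following Python sketch shows widening-based iteration on intervals for a hypothetical loop `x := 0; while x < 100: x := x + 1`. It uses the textbook interval widening (unstable bounds jump to infinity); the function names and the program are assumptions made for illustration, and narrowing is left out.

```python
import math


def join(a, b):
    """Least upper bound of two intervals; None encodes bottom."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def widen(old, new):
    """Interval widening: any bound that is not stable jumps to infinity."""
    if old is None:
        return new
    if new is None:
        return old
    lo = old[0] if old[0] <= new[0] else -math.inf
    hi = old[1] if old[1] >= new[1] else math.inf
    return (lo, hi)


def loop_body(itv, bound=100):
    """Abstract effect of `assume x < bound; x := x + 1` on an interval."""
    if itv is None or itv[0] >= bound:
        return None
    return (itv[0] + 1, min(itv[1], bound - 1) + 1)


def loop_head_invariant(init=(0, 0)):
    """Iterate X := X widen (init join body(X)) until it stabilises."""
    x, steps = None, 0
    while True:
        nxt = widen(x, join(init, loop_body(x)))
        steps += 1
        if nxt == x:
            return x, steps
        x = nxt


invariant, steps = loop_head_invariant()
# invariant == (0, inf), reached after 3 iterations
```

Without widening, the same loop would need about a hundred iterations before stabilising at `(0, 100)`; the price of the acceleration is the lost upper bound, which a narrowing pass could partly recover.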

To avoid tedious termination proofs, it is also possible to artificially bound
the number of iterations in the abstract semantics computation [14]. In that case, one has to certify that
any iteration yields a correct result, even if not the best one, or to check that the final result is indeed a
correct approximation of the concrete semantics. This technique leads us to result certification.
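One simple way to bound the iteration count is to pass an explicit amount of "fuel" to the iterator. The Python sketch below is one possible variant and is not claimed to be the scheme of [14]: when the fuel runs out it falls back to the top element, which is sound by construction. The loop being analyzed and the names `step` and `iterate_with_fuel` are hypothetical.

```python
import math

TOP = (-math.inf, math.inf)


def join(a, b):
    """Least upper bound of two intervals; None encodes bottom."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def step(x, bound=100):
    """Loop-head equation of `x := 0; while x < bound: x := x + 1`:
    join of the entry value (0, 0) with the abstract output of the body."""
    if x is None or x[0] >= bound:
        body = None
    else:
        body = (x[0] + 1, min(x[1], bound - 1) + 1)
    return join((0, 0), body)


def iterate_with_fuel(f, fuel):
    """Iterate x := x join f(x) at most `fuel` times; no termination proof needed."""
    x = None
    for _ in range(fuel):
        nxt = join(x, f(x))
        if nxt == x:
            return x, True     # stabilised: f(x) is below x, so x is a post-fixpoint
        x = nxt
    return TOP, False          # out of fuel: fall back to top, which is always sound


out_of_fuel = iterate_with_fuel(step, 10)    # (TOP, False)
stabilised = iterate_with_fuel(step, 200)    # ((0, 100), True)
```

The iterator is structurally recursive on the fuel, so a proof assistant accepts it without any argument about the height of the lattice.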

3.2 Result-only Certification 

In a safety-critical context, it is likely that the high confidence we are looking for is not for all results of
the analyzer but for a few specific ones, like those obtained for the next version of the flight-command
program. Instead of globally certifying the analyzer itself, it might be interesting to provide a tool
that checks the correctness of its result. This approach is directly related to the Proof Carrying Code (PCC)
technique [16, 1], where the code producer provides a formal proof that the code respects some safety
requirements defined by the end-user. The user then verifies the proof with an automated and trusted
checker. This use of a PCC technique for analysis certification has been proposed in [6]. The main
requirement is the same as in the previous approaches: we still have to give a proof that
$F_i \circ \gamma \subseteq \gamma \circ F^\sharp_i$. But now, instead of computing a solution of (1), we just have to check
that a given certificate $S^\sharp$ is indeed a solution. Note that this proof can be automatically discharged by a computation.
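Checking a certificate amounts to one ordering test per edge. The Python sketch below illustrates this for an interval analysis of a hypothetical four-node program; the abstract transfer functions, the `EDGES` table and the `check_certificate` function are assumptions made for the example, and the treatment of initial states is omitted as in the rest of the paper.

```python
import math

TOP = (-math.inf, math.inf)


def leq(a, b):
    """Interval ordering; None encodes bottom."""
    if a is None:
        return True
    if b is None:
        return False
    return b[0] <= a[0] and a[1] <= b[1]


def assign_zero(itv):                      # x := 0
    return None if itv is None else (0, 0)


def assume_lt_100(itv):                    # assume x < 100
    if itv is None or itv[0] >= 100:
        return None
    return (itv[0], min(itv[1], 99))


def incr(itv):                             # x := x + 1
    return None if itv is None else (itv[0] + 1, itv[1] + 1)


def assume_ge_100(itv):                    # assume x >= 100
    if itv is None or itv[1] < 100:
        return None
    return (max(itv[0], 100), itv[1])


# Hypothetical program graph: abstract transfer function attached to each edge.
EDGES = {(1, 2): assign_zero, (2, 3): assume_lt_100,
         (3, 2): incr, (2, 4): assume_ge_100}


def check_certificate(cert):
    """True iff F#_{n,m}(cert[n]) is below cert[m] on every edge (n, m)."""
    return all(leq(f(cert[n]), cert[m]) for (n, m), f in EDGES.items())


good = {1: TOP, 2: (0, 100), 3: (0, 99), 4: (100, 100)}
bad = {1: TOP, 2: (0, 50), 3: (0, 50), 4: None}   # edge (3, 2) is violated
accepted, rejected = check_certificate(good), check_certificate(bad)
```

The checker never iterates: whatever (possibly untrusted) procedure produced the certificate, only `check_certificate` and the soundness of the transfer functions need to be trusted or proved.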

4 Deductive Verification of Analysis Results 

Since the main goal of static analysis is to generate invariants over program executions, it is natural to
try to validate these invariants with deductive verification techniques that are traditionally applied to
handwritten program invariants. In this section, we assume an axiomatic semantics given by a deductive
judgment $\vdash \{\phi\}\ i\ \{\psi\}$ for each instruction $i \in I$ such that, when property $\phi$ holds before executing $i$,
$\psi$ must hold after. Here, $\phi$ and $\psi$ are formulas in a given logic language $\mathcal{L}$. We denote by
$\pi : \{\phi\}\ i\ \{\psi\}$ a Hoare-proof derivation $\pi$ that is a valid proof of $\vdash \{\phi\}\ i\ \{\psi\}$. We write
$\rho \models \phi$ when an environment $\rho$ satisfies property $\phi$. To validate a set of invariants
$\phi_1, \ldots, \phi_N$ attached to each control point, it is sufficient to provide a set of Hoare proofs
$\pi_{n,m} : \{\phi_n\}\ i_{n,m}\ \{\phi_m\}$ for all $(n,m)$ in $E$.

An analysis result has to be first transformed into a set of assertions in the language $\mathcal{L}$. We thus
assume that each abstract element $a^\sharp \in \mathrm{Env}^\sharp$ can be translated into a formula $\ulcorner a^\sharp \urcorner$ in $\mathcal{L}$. In this setting,
an analysis result $S^\sharp$ is certified once the following statements have been formally machine checked.


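Hoare logic soundness: $\vdash \{\phi\}\ i\ \{\psi\}$ implies $(\forall \rho, \rho',\ \rho \to_i \rho' \text{ and } \rho \models \phi \text{ implies } \rho' \models \psi)$

Provability of assertions: $\forall (n,m) \in E$, $\{\ulcorner S^\sharp_n \urcorner\}\ i_{n,m}\ \{\ulcorner S^\sharp_m \urcorner\}$ is provable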


The advantage of the approach is that the soundness of Hoare logic can be proved once and for all, and
used for several static analyses. If we assume that the Hoare logic is not only sound but also complete,
that the transformation $\ulcorner \cdot \urcorner$ preserves satisfiability (i.e. for all $\rho \in \mathrm{Env}$ and $a^\sharp \in \mathrm{Env}^\sharp$,
$\rho \in \gamma(a^\sharp)$ iff $\rho \models \ulcorner a^\sharp \urcorner$), and that each transfer function $F^\sharp_i$ is sound w.r.t. $F_i$,
then the corresponding set of Hoare triples is provable for any solution $S^\sharp$ of the analysis. However, a proof
of these triples still has to be constructed, without entering a painful manual process for each of them.
A first approach, proposed by Seo et al. [19], instruments the analyzer to make it produce a proof derivation
$\pi_{n,m}$ for all edges $(n,m)$. This approach has also been followed by Beringer et al. [3, 4], who translate
type derivations into Hoare proofs. The approach proposed independently by Chaieb [8] relies on a weakest
precondition computation. This time the analyzer is instrumented to produce proof terms for the verification
conditions generated by a weakest precondition computation. Although these approaches are elegant, they remain
difficult to implement, because generating proof terms requires more technical ability than transposing a
pencil-and-paper proof into a proof assistant.

Ideally, the proof obligations (obtained for example with the weakest precondition calculus) should
be automatically discharged by a trustworthy theorem prover, which would make proof generation unnecessary.
However, each analysis may require specific decision procedures. Instead of producing a specific proof
for each verification condition, we believe it may be a better idea to strengthen the theorem prover
with a decision procedure able to discharge exactly this kind of formula. In addition to validating the
analysis result, this is likely to improve the prover itself. Since each analyzer already embeds dedicated decision
procedures in its transfer functions and partial order tests, it would be useful to share this capability with
an automatic prover. The research line we advocate here is thus to design abstract domains and decision
procedures in parallel. To ensure the validity of the approach, the decision procedures must themselves
be certified, i.e. must generate proof terms of validity.
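As a toy illustration of a decision procedure tailored to an abstract domain, the Python sketch below handles interval assertions and the single instruction `x := x + c`. The verification condition obtained by weakest precondition is an implication between two interval constraints, which reduces to comparing bounds. Everything here (the names `wp_add`, `implies`, `triple_valid` and the restriction to one instruction) is an assumption for the sake of the example, and the procedure is neither certified nor proof-producing.

```python
import math


def wp_add(c, post):
    """Weakest precondition of `x := x + c` w.r.t. the assertion lo <= x <= hi."""
    lo, hi = post
    return (lo - c, hi - c)


def implies(pre, post):
    """Decide: for all integers x, pre_lo <= x <= pre_hi implies post_lo <= x <= post_hi."""
    if pre[0] > pre[1]:          # unsatisfiable hypothesis: implication holds trivially
        return True
    return post[0] <= pre[0] and pre[1] <= post[1]


def triple_valid(pre, c, post):
    """Validity of the Hoare triple {pre} x := x + c {post} on interval assertions."""
    return implies(pre, wp_add(c, post))


ok = triple_valid((0, 99), 1, (0, 100))            # True
too_weak = triple_valid((0, 100), 1, (0, 100))     # False: x = 100 yields 101
unbounded = triple_valid((0, math.inf), 1, (1, math.inf))   # True
```

The bound comparison in `implies` is the same test an interval analyzer performs for its partial order, which is the kind of sharing between analyzer and prover advocated above.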
5 Conclusion

We have considered several techniques for certifying the soundness of a static analyzer or of its result.
Of course, there is no silver-bullet technique: although the deep approach is the most demanding in terms of proof
effort, it is the only one that detects precision bugs. Revealing such bugs too late during a validation campaign
may compromise the availability of the safety-critical system that has to be validated in due time.
Deductive verification appears as a promising technique, but its apparent generality must be tempered: the
underlying logic is not always expressive enough to translate the result of the analysis, and automatically
discharging verification conditions requires technical instrumentation of the analyzer. Building abstract
domains and decision procedures in parallel seems an interesting research line to push further.